Optimal Bucket Allocation Design of k-ary MKH Files for Partial Match Retrieval
Abstract
This paper first shows that the bucket allocation problem of an MKH (Multiple Key Hashing) file for partial match retrieval can be reduced to that of a smaller subfile, called the remainder of the file. It also points out that the remainder-type MKH file is the hardest MKH file for which to design an optimal allocation scheme. We then concentrate on the allocation of an important remainder-type MKH file, namely the k-ary MKH file. We present various sufficient conditions on the number of available disks and the number of attributes for a k-ary MKH file to have a perfectly optimal allocation among the disks for partial match queries. Based on these perfectly optimal allocations, we further present a heuristic method, called the CH (Cyclic Hashing) method, that produces near-optimal allocations for general k-ary MKH files. Finally, an experimental comparison between the proposed method and an “ideal” perfectly optimal method shows that the CH method performs satisfactorily for general k-ary MKH files.
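To make the notion of a perfectly optimal allocation concrete, the following is a minimal sketch, not the paper's CH method (which the abstract does not specify). It assumes the usual reading of the term: for every partial match query, no disk holds more than ⌈q/M⌉ of the q qualifying buckets, where M is the number of disks. The well-known Disk Modulo rule stands in as the allocation under test; the function names `disk_modulo` and `is_perfectly_optimal` are hypothetical and chosen for illustration only.

```python
from collections import Counter
from itertools import product
from math import ceil


def disk_modulo(bucket, num_disks):
    """Disk Modulo rule: disk = (sum of bucket coordinates) mod num_disks."""
    return sum(bucket) % num_disks


def is_perfectly_optimal(k, n, num_disks, allocate=disk_modulo):
    """Brute-force check over all partial match queries of a k-ary file
    with n attributes (assumed definition of 'perfectly optimal')."""
    # A query pattern fixes some attributes to a value in 0..k-1 and
    # leaves the others unspecified (None).
    for pattern in product([None, *range(k)], repeat=n):
        free = [i for i, v in enumerate(pattern) if v is None]
        load = Counter()
        # Enumerate every bucket that qualifies for this query.
        for values in product(range(k), repeat=len(free)):
            bucket = list(pattern)
            for i, v in zip(free, values):
                bucket[i] = v
            load[allocate(tuple(bucket), num_disks)] += 1
        # The busiest disk must not exceed the ideal even share.
        if max(load.values()) > ceil(k ** len(free) / num_disks):
            return False
    return True


print(is_perfectly_optimal(2, 2, 2))  # binary file, 2 attributes, 2 disks
print(is_perfectly_optimal(2, 3, 4))  # binary file, 3 attributes, 4 disks
```

Under these assumptions the first call reports an optimal allocation and the second does not, because the query that leaves all three attributes unspecified places three of its eight buckets on one disk when the ideal share is two. The check is exponential in n and is meant only for small illustrative cases.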
Similar resources
An Optimal Disk Allocation Strategy for Partial Match Queries on Non-Uniform Cartesian Product Files
The disk allocation problem addresses the issue of how to distribute a file onto several disks to maximize the concurrent disk accesses in response to a partial match query. In the past this problem has been studied for binary as well as for p-ary cartesian product files. In this paper, we propose a disk allocation strategy for non-uniform cartesian product files by a coding theoretic approach...
Load Balanced and Optimal Disk Allocation Strategy for Partial Match Queries on Multidimensional Files
A multidimensional file is one whose data are characterized by several attributes, each specified in a given domain. A partial match query on a multidimensional file extracts all data whose attributes match the values of one or more attributes specified in the query. The disk allocation problem of a multidimensional file F on a database system with multiple disks accessible in parallel is the p...
Optimal Linear Hashing Files for Orthogonal Range Retrieval
In this paper, we are concerned with the problem of designing optimal linear hashing files for orthogonal range retrieval. Through the study of performance expressions, we show that optimal basic linear hashing files and optimal recursive linear hashing files for orthogonal range retrieval can be produced, in certain cases, by a greedy method called the MMI (minimum marginal increase) method; a...
Multidisk Partial Match File Design with Known Access Pattern
The problem of multidisk partial match file design is a file allocation problem among multiple independently accessible disks so that, for all possible partial match queries, maximal disk access concurrency can be obtained. Since this problem has been shown to be NP-hard, there are a few heuristic methods based upon the Disk Modulo (DM) allocation method [5,6,9]. The file systems proposed for t...
Pólya Urn Models and Connections to Random Trees: A Review
This paper reviews Pólya urn models and their connection to random trees. Basic results are presented, together with proofs that underlie the historical evolution of the accompanying thought process. Extensions and generalizations are given according to chronology: • Pólya-Eggenberger’s urn • Bernard Friedman’s urn • Generalized Pólya urns • Extended urn schemes • Invertible urn schemes ...
Journal:
- IEEE Trans. Knowl. Data Eng.
Volume 9
Publication date: 1997